%!TEX root = ../book.tex
\chapter{Task execution}\label{chap:taskexecution}

In its most basic implementation, sensors and actuators can be directly tied to each other, removing the need for computation. Such robots are purely reactive, thereby missing the ability to ``think'' or plan. In order to achieve more complex behavior, memory and state are needed to switch between different controllers and algorithms.

This chapter introduces these basic principles as well as their implementation, starting with basic reactive controllers (\cref{sec:braitenberg}), then introduces more advanced concepts that let the robot make basic ``if'' \ldots ``then'' decisions using Finite State Machines (FSM) in \cref{sec:fsm,sec:stateflow}, and finally introduce advanced concepts such as behavior trees and semantic planning in \cref{sec:behaviortrees,sec:strips}.
The goals of this chapter are to introduce:

\begin{itemize}
    \item the basic control loop that allows robots to react to their environment, 
    \item ways to introduce state that allow robots to switch their behavior, 
    \item basic concepts that allow the robot to reason about its discrete states and pick the next action. 
\end{itemize}

\section{Reactive control}\label{sec:braitenberg}

A wide variety of robotic behaviors can be accomplished by directly connecting sensor input to actuator output. These behaviors can even be accomplished without a computer, that is, using analog electronics that provide appropriate conditioning.
Simple autonomous robots that use these concepts have been demonstrated as early as 1953 \cite{walter1953living} and have become known as ``tortoises''. For example, by tying the output of a light sensor to a motor controller, the motor turns faster when the light is brighter. Using an inverse relationship between sensor and motor, the motor will turn slower when the light is brighter. When used in a differential wheel configuration with two motors and two light sensors, such a robot can be built to either drive toward or away from the light.

Formally, we can express the light following behavior---also known as \textsl{phototaxis}\index{Phototaxis}---via the following relationship between
the left and right wheel speeds $\dot{\phi_l}$ and $\dot{\phi_r}$ and the measurements of the right and left light sensors $\lambda_r$ and $\lambda_l$:
\begin{eqnarray}\label{eq:simplereactive}
\dot{\phi_l}&=&a \lambda_r + b\quad \\
\dot{\phi_r}&=&a \lambda_l + b\quad,
\end{eqnarray}
with $a$ a constant weight, and $b$ a bias term. We observe that the left wheel turns faster the brighter the light shines on the right sensor. If the right light sensor receives more light than the left sensor the right wheel will turn slower, thereby exhibiting a phototaxis behavior that results in a right turn.

\begin{figure}
    \centering
    % \includegraphics[width=0.7\columnwidth]{figs/braitenberg.png}
    \def\svgwidth{0.7\textwidth}
    \import{./figs/}{braitenberg.pdf_tex}
    \caption{Two vehicles approaching a light source. The brighter the light, the more does each motor turn. The left vehicle will therefore approach the light by turning toward it, the right vehicle will avoid it by turning away from it.\label{fig:braitenberg}}
\end{figure}

A more complex reactive behavior is obstacle avoidance. Assuming the output of an obstacle sensor increases as the obstacle nears (e.g., an infrared proximity sensor), we can use the same principle to compute the wheel speeds such that the obstacle is actively avoided. An example for a differential-wheel robot with eight infrared proximity sensors is illustrated in  given by:
\begin{eqnarray}
\nonumber
\dot{\phi_l}&=&-6d_0-6d_1-19d_2-13d_3+94d_4+63d_5-50d_6-6d_7+b\\
\nonumber
\dot{\phi_r}&=&-6d_0+50d_1+63d_2+94d_3-22d_4-10d_5-6d_6-6d_7+b \quad,
\end{eqnarray}
%
% back, left {-0.06, -0.06},
% left {-0.06, 0.5},
% front, left {-0.19, 0.63},
% front, front, left{-0.13, 0.942}
% front, front, right {0.942, -0.22}, % left right motor
% front, right {0.63, -0.1},
% right {0.5, -0.06},
% back, right {-0.06, -0.06},
%
with the eight sensors $d_0 \ldots d_7$ arranged similarly to the E-Puck differential wheel robot in \cref{fig:epuckir}: $d_0$ as the left rearward sensor and the other sensors being arranged clockwise such that $d_7$ is the right rearward one.

\begin{figure}
    \centering
    % \includegraphics[width=0.9\columnwidth]{figs/ePuck.png}
    \def\svgwidth{\textwidth}
    \import{./figs/}{epuck.pdf_tex}
    \caption{A schematic of a differential-wheel robot with eight infrared distance sensors (left) and typical sensor response as a function of distance (right).}
    \label{fig:epuck}
\end{figure}

Behaviors such as phototaxis and obstacle avoidance can also be combined by simply weighing each input accordingly. This idea has been popularized by the neuroscientist Valentino Braitenberg, who augmented this system with additional ideas around basic forms of learning (by changing the weights based on events such as collisions), natural selection (building robots with random weights and selecting those that perform best), and analogies to the human brain \cite{braitenberg1986vehicles}. Controllers of these kind are therefore often called ``Braitenberg vehicles''.
%
Indeed, the controllers above bear strong resemblance to artificial neural networks such as those described in \cref{chap:ann}, and ``optimal'' values to obtain a certain behavior can be obtained using evolutionary computation \cite{floreano1998evolutionary} or by training a neural network on recordings of input/output pairs that correspond to a desired behavior.

There are numerous variants of the control architecture including the \textsl{subsumption architecture} \cite{brooks1990elephants} and \textsl{motor schemas} \cite{arkin1989motor} that propose variations of switching different components of a reactive controller on and off to obtain desired behaviors.
However, while useful for achieving relatively simple behaviors and capable of exhibiting more complex, emergent ones, these approaches are difficult to manage in practice and are better solved by being embedded in high-level control frameworks.

\subsection{Limitations of reactive control}

The limitations of a reactive control scheme are evident when considering that a robot combining both phototaxis and obstacle avoidance will still get stuck if presented with a U-shaped obstacle (\cref{fig:uobstacle}). While obstacle avoidance will prevent the robot from hitting the obstacle, as soon as the way is clear, the robot will keep turning toward the light, thereby getting stuck in a loop. This type of behavior can also be observed in insects such as flies or moths.

\begin{figure}
\centering
    % \includegraphics[width=0.8\columnwidth]{figs/uobstacle.png}
    \def\svgwidth{0.64\textwidth}
    \import{./figs/}{uobstacle.pdf_tex}
    \caption{\label{fig:uobstacle}A differential wheel robot with distance and light sensors wired in a ``light following'' configuration in an U-shaped obstacle. Although the obstacle will be avoided, the light following behavior will continuously drive the robot into the obstacle unless state is added.}
\end{figure}

In order to avoid this situation, the robot needs to memorize its previous state and switch behaviors accordingly. For example, in addition to the basic combined avoidance and following behavior (``avoid and follow''), we can introduce an additional term (``wall following'') in which the robot uses its proximity sensors to maintain a constant distance to a wall. In order to switch from one to the other behavior, we need to change the constant gains into dynamic ones that change their value based on other observations the robot makes. For example, the robot could estimate its progress by monitoring whether its light sensor is constantly increasing, and if it is not, inhibiting phototaxis behavior and emphasizing ``wall following''.

Designing reactive systems with time-dependent behavior and state is potentially realizable with very simple electronics; for this reason, we've seen real-world implementations of such mechanisms, e.g. in robot vacuum cleaners.
However, it very quickly becomes difficult to manage. It is therefore desirable to establish discrete abstractions for various behaviors, which can be more easily managed and understood by a programmer.

\section{Finite State Machines}\label{sec:fsm}

A simple yet powerful tool to facilitate switching between different behaviors is a so-called \textsl{Finite State Machine} (FSM)\index{Finite State Machine}\index{FSM}. In an FSM, each state is associated with a specific controller. In practice, an FSM consists of a global variable that stores the current state and a series of ``if'' statements that contain the code that is associated with each unique state. For example, an FSM to perform phototaxis while avoiding U-obstacles could consist of four states, with one for each desired behavior: one state that computes wheel-speeds so that the robot moves toward the light, another to use its sensors to avoid obstacles ahead of it, a state that computes wheel-speeds so the robot performs a ``wall following'' behavior for a fixed amount of time, and finally a state to stop the robot. An example of these states is shown in \cref{fig:fsm}.

\begin{figure}
\centering
    % \includegraphics[width=0.8\columnwidth]{figs/uobstacle.png}
    \small
    \def\svgwidth{0.9\textwidth}
    \import{./figs/}{FSM.pdf_tex}
\caption{A simple Finite State Machine (FSM) with four states. The final state is colored in gray; the initial state is set apart with a double circle.\label{fig:fsm}}
\end{figure}

To specify an FSM, one also needs to specify the \textsl{state transitions}\index{state transition}\index{transition (state)}, i.e. the conditions that determine when to switch states. For example, if multiple sensors detect an obstacle (implying that it may be a large one), then it may be desirable to have the FSM transition from its first state (phototaxis with simple obstacle avoidance) to its second (avoid obstacles). Should the light measurement decrease, such as when the robot needs to make a u-turn in order to avoid the obstacle, the behavior should transition to ``wall following''. Once the light increases again, such as when the robot has gone around the obstacle, it resumes light following. Once the light sensor exceeds a threshold (``bright enough''), the robot stops.

Finally, it is necessary to specify an initial state (the state the system starts in) and any number of final states (terminal states that signify program termination). In the example in \cref{fig:fsm}, the program will always start in light following mode and terminate once the state labeled ``stop'' is reached.

Formally, a FSM is defined by a Tuple $(\Sigma, S, s_0, \delta, F)$ where:
\begin{itemize}
\item $\Sigma$ is the input \textsl{Alphabet}, i.e. a set of symbols that represent events that can trigger state transitions,
\item $S$ is a finite set of states,
\item $s_0$ is an initial state and an element of $S$, i.e. $s_0 \in S$,
\item $\delta$ is the state-transition function $\delta: S \times \Sigma \rightarrow S$ that maps combinations of states in $S$ and symbols $x$ in $\Sigma$ to a new state in $S$, and
\item $F$ is the set of final states, a subset of $S$.
\end{itemize}

Historically, this definition stems from FSMs formally defining the working of a computer with a stream of symbolic commands of an actual program. In robotics, symbols that trigger state transitions can themselves be the result of complex computations. An example of such is the robot switching to wall-following if it has not made actual progress toward its goal in some time and resumes phototaxis once it reached a position that is closer to the light than it was before.

In conjunction with a controller for each state, an FSM is called a \textsl{Hybrid System}\cite{van2000introduction} as it combines both discrete (the state) and continuous (the controller outputs) variables.

\subsection{Implementation}

A low-level robot controller is usually implemented as a loop with fixed loop time, for example $100ms$ for slow moving differential-wheel robots or $1ms$ for dynamical systems such as drones or humanoid robots. At each start of the loop, the controller reads all sensors, then branches into the part of the code that corresponds to its current state, processes sensor information and computes actuator output, and finally sends the control commands to the actuators.

Unlike a computer program that can process information as fast as possible, a robot controller needs to wait until sensor information are actually available and actuator commands are executed (i.e. that the robot has physically moved). As the robot keeps moving while computation is ongoing, it is important to run the main loop at a constant rate. As computation is usually much faster than the loop time, it might be necessary to use an internal clock to wait until the loop time is completed.


%@book{van2000introduction,
%  title={An introduction to hybrid dynamical systems},
%  author={Van Der Schaft, Arjan J and Schumacher, Johannes Maria},
%  volume={251},
%  year={2000},
%  publisher={Springer London}
%}

It is helpful to capture all the FSM's state transitions in a drawing as shown in \cref{fig:fsm}. In practice, FSMs are difficult to develop, debug, and maintain. As the controller is being developed and experimented with, edge cases require the addition of transitions and states. With $N$ states, there are possibly $NxN$ state transitions, and it is typical to discover necessary additional state transitions as the FSM is specified. FSMs with many transitions become difficult to depict graphically, making it difficult to visualize what the program will actually do.

Whenever a state is added or removed, the programmer has to identify transitions required for the new one or modify all other states that have transitions to the one being removed, further contributing to FSM maintenance difficulty. Although behaviors such as obstacle avoidance are generic, each state also contains transitions that are specific to an application, potentially making it difficult to reuse states in other FSMs (modularity). 
%
FSMs also have difficulties with state transition conditions that cannot be evaluated in a single time step, e.g. when averaging the gradient of the light sensor to robustly detect an increase or decrease. In this case, these computations need to be carefully woven into the state execution code, adding complexity and making maintenance difficult.

\section{Hierarchical Finite State Machines}\label{sec:stateflow}

In order to make FSMs more manageable and to deal with information that needs to be processed at different time-scales, states can be grouped into clusters---each with their own associated FSM, thereby creating ``super-states'' organized in hierarchical fashion. This construct is usually referred to as Hierarchical FSM (FSM) but also known as ``Statechart'' \cite{harel1987statecharts}.
Considering the example in \cref{fig:fsm}, each state might as well be a super-state: e.g., the ``follow wall'' state may consist of an FSM that deals with an edge case such as rounding sharp corners. An example HFSM is depicted in \cref{fig:hfsm}. State transitions between super states can be tied to states in the included FSM or be implicitly connected to all states of the included FSM, which allows leaving the super state from every state therein.

Super states can also be executed in parallel, providing events that lead to state transitions in other FSMs. For example, detecting whether a robot still makes progress toward a light while avoiding an obstacle might require computing a running average and dropping outliers. This is illustrated by two super-states, one for the actual light-following behavior and the other for computing a running average of the light measurement, rejecting outliers, and generating symbols that can be consumed by the light following behavior. In \cref{fig:hfsm}, we are using the character ``/" to delineate state transition conditions such as ``Light decreases'' and the symbol that is generated during the state transition (here ``LD''). These symbols can then drive state transitions in other clusters.

\begin{figure}
\centering
    \includegraphics[width=1.0\textwidth]{figs/HFSM.pdf}
    % \tiny
    % \def\svgwidth{1.0\textwidth}
    % \import{./figs/}{HFSM.pdf_tex}
\caption{A hierarchical FSM with the states from \cref{fig:fsm} as super states, a more sophisticated wall-following behavior, and signal processing for averaging the light measurement being performed in parallel. \label{fig:hfsm}}
\end{figure}

\subsection{Implementation}

In practice, HFSMs are implemented in distinct processes that run independently and asynchronously. They can communicate using an inter-process communication (IPC) framework \index{Inter-process communication}\index{IPC} such as XMLRPC or REST, which are socket-based networking protocols that allow to exchange eXtended Markup Language (XML) or JavaScript Object Notation (JSON) data structures between two processes on the same or different computers using a networking interface.
%
There exist many IPC frameworks that are particularly targeted at robotics and introduce abstractions for robot-specific data structures such as coordinate frames or video streams, the associated tools to manage them, and bindings for different languages to publish and subscribe to them. A prominent example is the Robot Operating System (ROS)\index{Robot Operating System}\index{ROS}.

HFSMs solve some of the problems of FSMs by increasing modularity and simplifying programmability, but still have the problem that $N$ states can lead to $N^2$ state transitions, each of which needs to be manually coded.

\section{Behavior Trees}\label{sec:behaviortrees}

A Behavior Tree \cite{colledanchise2018behavior} provides structure for hierarchically organizing the decision-making flow of a system that makes many of the considerations that need to be explicitly coded in an FSM implicit instead. The leaves of a Behavior Tree are ``Action Nodes'' that can represent actual discrete behaviors, such as ``Close Gripper'' or ``Find Block''. The root and internal nodes of the Behavior Tree are made up of ``Utility Nodes'' that guide the path of traversal through the tree.
What equates to manual addition and removal of state transitions in an FSM can often be accomplished in a Behavior Tree by simply changing the type of an utility node from one to another. Another powerful aspect of this abstraction is that Behavior Trees specifying complex behaviors, such as ``Navigate to Kitchen", can be encapsulated within a single node of another tree.

\subsection{Node Definition and Status}

In a traditional implementation, the nodes within a Behavior Tree can return any of three statuses when queried: ``Success'', ``Failure'', or ``Running''. The incorporation of a ``Running'' status allows the Behavior Tree to use behaviors that operate over longer time periods, such as a block picking behavior that persists over multiple control cycles of a robot's main processing loop, including time required to plan the end-effector's path, the time required to physically move the robot to the destination, and the time required to close the gripper. In this example, the node might return ``Failure'' if any of the individual behaviors didn't work or if the end-effector didn't successfully grasp the block by the end of the behavior, and ``Success'' otherwise. Thus, each node in the Behavior Tree needs a rigidly defined notion of ``Success'' or ``Failure'' that can be propagated throughout the Behavior Tree, informing which sequence of behaviors is executed to achieve the desired result.

Unlike the Finite State Machine formalism that didn't incorporate an explicit notion of time, the ``Running'' status enables nodes to operate using the information that their child nodes may take variable amounts of time, with each discrete unit of time defined as a \textsl{tick}. This design choice simplifies the specification of control flow, and dramatically reduces the number of explicit transitions that are needed to model a system. Suffice to say, for a robot with a $100ms$ control loop, many of the discrete behaviors that a programmer would be interested in (such as turning $180$ degrees or moving forward one meter) are likely to require more than a single program cycle and will have action nodes that run for multiple ticks.

Nodes may also be parameterized, allowing for information computed from one node to be passed on and used in a subsequent node. Consider building a Behavior Tree for sorting blocks on a table into bins by color: one way to organize this Behavior Tree is repeating the sequence of behaviors ``Find Block'', ``Pick Block'', ``Get Block Color'', ``Place Block in Bin'' until no blocks remain. In this case, the behaviors ``Get Block Color'' and ``Place Block in Bin'' are connected, since the color of the block will determine which bin it should be placed in. This potential for interaction between nodes allows for powerful expressiveness of complex behaviors.

\subsection{Node Types}

Within a Behavior Tree, nodes can generally be classified based on their connectivity (do they have children, and if so, how many?) and function (is this a utility node that determines control flow, or is it an action node executing the action itself?). The three primary node types are \textsl{composite}, \textsl{decorator}, and \textsl{action}.

\textsl{Composite nodes} have one or more children and are responsible for regulating the control flow. Three important examples of composite nodes are the \textsl{sequence node}, \textsl{selector/fallback node}, and \textsl{parallel node}. A sequence node executes each of its child nodes in order, returning ``Failure'' if a single one fails and ``Success'' after all have finished successfully. A selector (or fallback) node executes each of its child nodes in order, but returns ``Success'' once a single child node succeeds, only returning ``Failure'' if all child nodes have failed. Sequence nodes can be thought of as analogous to an \textsl{AND} conditional statement, while selector nodes are similar to an \textsl{OR} conditional statement. A parallel node has $N>1$ children and attempts to execute its child nodes in parallel, returning ``Success" if $M$ or more children succeed and failure if more than $(N-M)$ children fail, for any choice of $M\leq N$.

\textsl{Decorator nodes} have exactly one child node and perform transformations on its child node's outputs back to its parents. An example of a simple decorator node is the \textsl{Inverter}, a node that inverts the return status of its child, effectively producing a \textsl{NOT} operation: if the child node returns ``Success'' then the decorator returns ``Failure'', and vice versa. Another useful decorator node is one that returns ``Success'' when its child returns a status of either ``Success'' or ``Failure'', allowing for the inclusion of action nodes where success is not critical for the behavior. Decorator nodes can also be designed to repeat the execution of its child, for instance until it returns a ``Success'' status, until it returns a ``Failure'' state, or endlessly (which is typically placed as the tree's root node to ensure continuous operation).

\textsl{Action nodes} have zero child nodes, and represent the execution of a discrete behavior. These nodes can use input parameters, return output values, and generally have any amount of complexity that the designer desires to program within them. Crucially, an entire Behavior Tree can be treated as a single action node, allowing for the composition of multiple Behavior Trees to build arbitrarily complex behaviors!

\begin{table}[h!]
	\centering
	\begin{tabular}{ |c c c| }
		\hline
		Node Class & Node Type & Symbol \\
		\hline
		Composite & Sequence & \fbox{$\rightarrow$} \\
		Composite & Selector/Fallback & \fbox{?} \\
		Composite & Parallel & \fbox{$\rightrightarrows$} \\
		Decorator & Decorator & $\Diamond$ \\
		Action & Action & \fbox{Text} \\
		\hline
	\end{tabular}
	\caption{Common Behavior Tree nodes and their symbols.}
\end{table}

\subsection{Behavior Tree Execution}

For each unit of time (e.g., control cycle) that passes, a preorder tree traversal occurs where nodes are recursively visited and evaluated left-to-right, commonly described as \textsl{propagating a tick} signal through the tree. In doing so, each parent node calls on its child nodes in order to retrieve their status. If a child node returns ``Success'', the parent node will move on to its next child node. If a child node returns a status of ``Running'' then the parent node will return ``Running'' without moving on to the next child node unless it permits running multiple child nodes in parallel. If a child node returns a status of ``Failure'', its behavior will depend on the type of node of the parent, for example returning ``Failure'' if the parent is a sequence node or moving on to the next child node if it is a selector node.

Consider the example of a robotic manipulator inserting a peg into a hole in \cref{BTpeg}: the first tick through the tree will trigger the \textsl{Move Peg to Surface} action. Subsequent ticks will be absorbed into this action until it returns a status other than ``Running'', at which point the next action will be triggered and the same absorption of ticks will occur in the new action being executed. If any single action node returns a status of ``Failure'', the entire behavior will result in a ``Failure'' status. A slightly more complex behavior tree is demonstrated via the \textsl{Pick Square Peg} behavior shown in \cref{fig:BTpick}, which allows the robot to check whether or not it is holding a peg already, only moving its gripper for the pick action if this check fails.

\begin{figure}
    \centering
    \begin{forest}
        {for tree={%
                minimum height = 4ex,
                minimum width = 4ex,
                draw,
                parent anchor=south,
                child anchor=north,
                align=center
            }
        }
        [{\scriptsize Insert Square Peg}\\ $\longrightarrow$
        [{\scriptsize Move Peg \\ \scriptsize to Surface}]
        [{\scriptsize Move Peg\\ \scriptsize over hole}]
        [{\scriptsize Rotate Peg\\ \scriptsize until aligned}]
        [{\scriptsize Move Peg \\ \scriptsize down until \\ \scriptsize new contact}]
        ]
    \end{forest}
    \caption{Behavior Tree for a peg insertion task. A sequence node triggers the execution of movement actions that align a peg over a hole and lower it until the gripper makes contact with the surface.}
    \label{BTpeg}
\end{figure}

\begin{figure}
	\centering
	\begin{forest}
		{for tree={%
				minimum height = 4ex,
				minimum width = 4ex,
				draw,
				parent anchor=south,
				child anchor=north,
				align=center
			}
		}
		[{\scriptsize Pick Square Peg}\\ $?$
		[{\scriptsize Sense if peg already in gripper}]
		[{\scriptsize Get Peg\\ $\rightarrow$}
		[{\scriptsize Locate Peg}]
		[{\scriptsize Open Gripper}]
		[{\scriptsize Move Gripper to Peg}]
		[{\scriptsize Close Gripper \\ \scriptsize around peg}]
		]
		]
	\end{forest}
	\caption{Behavior Tree for a peg picking task. First, the selector node will check if a peg is already in the robot's gripper. If it is not, the first action node will fail and the \textsl{Get Peg} behavior will execute, returning ``Success'' only if the gripper is successfully closed around the peg.}
	\label{fig:BTpick}
\end{figure}

\subsection{Implementation}

As the execution of a Behavior Tree is fundamentally a tree traversal, a tree is the ideal data structure to store composite, decorator, and action nodes. As a large part of the mechanics of these nodes remain the same, nodes are usually implemented as classes (in an object-oriented programming sense), which need to be inherited, modified, and instantiated by the programmer. Programming a complex robotic system therefore starts with defining and implementing the basic action nodes and then recombining them with appropriate composition nodes and decorators until the robot performs as expected.

\section{Mission Planning}\label{sec:strips}

So far, we have seen how reactive behaviors can be composed into more complex programs using Finite State Machines and Behavior Trees. Although Behavior Trees facilitate dealing with the exploding number of possible state transitions by making them implicit, the programmer still needs to define the entire program flow. Consider again a pick-and-place task. This time, we will not simply grasp a new item in case the object falls out of the hand, but try to find it on the table and try to pick it up from there. In a more elaborate version, we might also go on and search for the object on the floor if it cannot be found on the table. But why not have the robot replace an object from a warehouse if it cannot be found, or even mail order a new version? Obviously, it is very cumbersome to foresee all these eventualities when programming a robot. We therefore need a framework to make it easier to compose behaviors in real-time. This is where \textsl{mission planning} excels.

An example of mission planning is described in \cite{saito2011semantic}, where a robot is tasked to deliver a sandwich. The robot initially moves to a fridge, opens it and looks for a sandwich there, and then decides to take the elevator to the sandwich store in the basement of the University of Tokyo's engineering tower. Here, the robot is not only piecing together behaviors as it goes, but also using what is known as ``semantic planning'' to select the right actions, exploiting databases of common sense knowledge in textual form. How to represent such knowledge in an efficient and general manner is an active research topic in robotics and artificial intelligence and goes beyond the scope of this introductory book. However, we describe here the basic algorithms that will allow you to compose complex behaviors at run-time, thereby generating much more complex robot responses than could ever be accomplished using hand-coding.

\subsection{The General Problem Solver and STRIPS}

One of the first planning frameworks was introduced in 1959 as the ``General Problem Solver'' (GPS) \cite{newell1959report}, an idea that was popularized, refined and actually demonstrated on real robots later on as the ``STanford Research Institute Problem Solver'' (STRIPS) \cite{fikes1971strips}.
%
In STRIPS, a robotic problem is composed of:

\begin{enumerate}
\item A set of symbols that represent the \textsl{initial state}
\item A set of symbols that represent the desired \textsl{goal state}
\item A set of \textsl{actions}, each with a set of \textsl{preconditions} and a set of \textsl{postconditions}.
\end{enumerate}

An action's preconditions are a set of symbols that need to be part of the current state for the action to execute. An action's postconditions are the set of symbols the action creates or deletes, thereby affecting the state. In a nutshell, a STRIPS planner will work backwards from a desired goal state, find actions that have equivalent post-conditions, and then recursively try to satisfy these actions preconditions.

For a robot to be able to get you a sandwich, a suitable goal state could be \textsc{robot has sandwich}=\textsl{true}, some possible actions \textsc{pick sandwich from fridge} and \textsc{open fridge door}, and initial states \textsc{sandwich in fridge}=\textsl{true} and \textsc{fridge door closed}=\textsl{true}.  An action \textsc{open fridge door} would then require \textsc{fridge door closed}=\textsl{true} as a precondition and lead to \textsc{fridge door closed}=\textsl{false} as a postcondition. The action \textsc{pick sandwich from fridge} would require \textsc{sandwich in fridge}=\textsl{true} and \textsc{fridge door closed}=\textsl{false} as preconditions and result in \textsc{robot has sandwich}=\textsl{true}. A planner can now work backwards from the desired state to identify appropriate actions and then satisfy their preconditions recursively. In this case, the planner will discover the action \textsc{pick sandwich from fridge} and then identify \textsc{open fridge door} to satisfy the precondition \textsc{fridge door closed}=\textsl{false}.

Formally, an instance of a STRIPS problem is a quadruple $\langle P, O, I, G \rangle$:
\begin{itemize}
\item $P$ is a set of propositional variables that can be either true or false and exhaustively describe the world the robot operates in.
\item $O$ is a set of operators, each itself a quadruple $\langle \alpha, \beta, \gamma, \delta \rangle$ whose entries determine the set of conditions that need to be true ($\alpha$) and that need to be false ($\beta$) for the action to take place, and a set of conditions that will be true ($\gamma$) and that will be false ($\delta$) if the action is successful.
\item $I$ is a set of conditions $I \subset P$ that are initially true and define the initial state, all other conditions are initially false.
\item $G$ is a tuple $\langle N, M\rangle$ in which $N$ is a set of conditions that need to be true and $M$ is a set of conditions that need to be false.
\end{itemize}

Formalizing the sandwich example from above using this framework is left to the reader as an exercise. It becomes quickly clear that the devil is in the details here. For example, we have assumed that the positions of the robot are resolved by the action themselves. In practice, the STRIPS plan would also require additional preconditions on the robot's location, e.g., \textsc{robot at fridge}=\textsl{true} which would then be resolved by the planner. An observant reader might have also noticed that great care needs to be taken in determining which variables an action actually affects and specifying the precisely desired goal state. For example, the plan as described above will lead to a scenario in which the fridge door remains open.

A common extension in a STRIPS instance is to parametrize locations and objects. In this case, \textsc{robot at fridge} would become \textsc{robot at X}. Values for ``X'' can then be substituted at run-time, for example when evaluating the preconditions of \textsc{open fridge}. Similarly, a STRIPS plan might be formulated to satisfy hunger, substituting the sandwich with other victuals. Managing these different categories, their contexts, and trade-offs between qualities of different outcomes becomes quickly challenging and is an ongoing subject of research.

Other challenges with STRIPS planning are exogenous events that change the environment state, such as a draft that closes the fridge door after the robot has opened it or probabilistic operators that might lead to different outcomes for an operator depending on chance. These are situations that are well covered by the Behavior Tree framework, making the combination of BT and STRIPS planning particularly compelling (see also \cite{colledanchise2018behavior}, Chapter 7).

\section*{Take-home lessons}
\begin{enumerate}
\item Writing a robotic program is fundamentally different from regular computer programs as the program flow needs to be coordinated with the actual physics of the world taking its course.
\item Discrete states are an abstraction of the physical world, and more complex behaviors lead to exponential growth in the number of states and transitions between them.
\item There exist multiple programming paradigms that make managing large number of states and possible transitions between them more manageable, but require an increasing amount of software and thereby increasing computational infrastructure.
\end{enumerate}

\section*{Exercises}\small

\begin{enumerate}
\item A differential wheel robot has three downward-facing light sensors at its tip. The sensors are spaced such that the robot can detect a black line on a white ground. Derive the equations for a line-following robot using the Braitenberg formalism.
\item Derive a control scheme that combines line following and obstacle avoidance. Discuss your choices assuming that the robot has to avoid obstacles at all cost.
\item Use a robotic simulator of your choice to implement basic phototaxis and obstacle avoidance.
\item Use a robotic simulator of your choice to implement basic wall-following behavior
\item Implement a light-following robot in a simulator of your choice and manually control it toward the light. Train a neural network for a Braitenberg controller using this data.
\item Implement a simple finite state machine that combines obstacle avoidance, phototaxis and wall-following and is capable to escape from a U-shaped obstacle
\item A FSM implements the following behavior: perform photo-taxis until an obstacle is it; then perform wall-following for 10 time steps. Draw an appropriate Finite State Machine. How many states do you need?
\item A robot runs at a 100ms loop time. Performing sensor readings takes 3ms, odometry computations 15ms, and executing logic takes 30ms on average. Which of these operations is likely to fail if the task logic takes 80ms?
\item Formulate both a Finite State Machine and a Behavior tree for the game ``Rats Life'', label each state and conditional transition, and compare the two representations.
\item Construct a behavior tree that enables a single robot manipulator arm to sort red, green, and blue blocks on a table into bins by color.
\item Construct a behavior tree that enables a bi-manual (two manipulator arm) robot to sort red, green, and blue blocks on a table into bins by color with both arms.
\item Formally describe a STRIPS instance for a robotic sandwich retrieval problem in which the fridge door is closed after the robot has retrieved a sandwich. 
\end{enumerate}\normalsize
